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We consider the problem of solving least 
squares problems involving a matrix M 
of small displacement rank with respect 
to two matrices Z^ and Z2. We develop 
formulas for the generators of the matrix 
M^M in terms of the generators of M and 
show that the Cholesky factorization of 
the matrix M ^M can be computed quickly 
if Zi is close to unitary and Z2 is triangular 
and nilpotent. These conditions are 
satisfied for several classes of matrices, 
including Toeplitz, block Toeplitz, Hankel, 
and block Hankel, and for matrices whose 
blocks have such structure. Fast Cholesky 
factorization enables fast solution of least 
squares problems, total least squares 
problems, and regularized total least 
squares problems involving these classes 
of matrices. 
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1. Introduction 

In [6], Mastronardi, Lemmerling, and Van Huffel 
present an algorithm for solving fast structured total 
least squares problems of the form 



mm 



i[^ ^]ii; 



(1) 



subject to the constraints 



iA + E)x = y + P 

with ^ e C"""" a given Toeplitz matrix and j e C'^'U 
given vector. They include one additional constraint: E 
is a Toeplitz matrix. They produced a fast algorithm 
for solving this structured total least squares problem 
(STLS) and showed that the solution was a better 



estimator than the solution to the total least squares 
problem without the Toeplitz constraint. 

In this paper, we work through the details of a small 
generalization of this algorithm. We consider the same 
problem (1), but under the constraint that^ and E have 
small displacement rank relative to some matrices Zj 
and Z2. Choosing these two matrices to be shift-down 
matrices and the rank to be two gives the Toeplitz 
constraint considered by [6], but we will be interested 
in other cases as well. See [3] for a discussion of dis- 
placement structure. 

We also consider fast solution of the problem under 
the additional constraint that the norm of the solution 
vector X is specified, a problem posed in Pruessner and 
O'Leary [10]. This corresponds to a Tikhonov regular- 
ization of our structured total least squares problem 
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and results in a fast solution algorithm for the problem 
considered in [8, 9, 10, 7]. 

The core of the algorithm in [6], based on a more 
general algorithm of [11], relies on two results: the 
representation of the generators for the matrix ^^^ that 
appears in the normal equations when A is Toeplitz, and 
then a fast factorization of a matrix derived from these 
generators. So we begin in Sec. 2 with a construction of 
the generators for M^M when M is any matrix of small 
displacement rank and Z^ is close to unitary. In Sec. 3 
we show that it is inexpensive to form a Cholesky 
factorization of M^M whenever Z2 is triangular and 
nilpotent. Section 4 concerns the application of this 
algorithm, and we conclude in Sec. 5. 

The results in this work are derived from [4, 5]. 

2. Generators of M^M 

One way to solve a least squares problem 
min II Mx - b 11 

involving a matrix M of full column rank is to use the 
normal equation formulation; we factor M^M = LL^ 
where Z is a square lower triangular matrix and then 
solve the linear system 



LL''x = M%, 



(2) 



Thus, we can solve this problem fast if we can 
compute a Cholesky factorization of M^M fast. To do 
this, we will first derive a generator for the matrix M^M 
when Me C"^" (m > n) has low displacement rank. 

Suppose that M has low displacement rank relative 
to the matrices Z^e C'"'''" and Z2 e C"""", which means 
that if we define 

N^M-Z,MZ^ , 

then rank (TV), which we denote by Pi, is small relative 
to n. 
Suppose 

z = z,+w 

is a unitary matrix (Z^Z = /), where W has rank p^ , also 
assumed to be small. For example, ifE is Toeplitz, let 
Zi be the shift-down matrix with ones on its subdiago- 
nal and zeros elsewhere, and then ^is the matrix with 
a one in the last position of row 1. 

Then M^M also has low displacement rank relative 
to Z2, as we can see from the identity 



M^'M - Z^M"" MZ"^ ^M^'M-Z^ M"" Z"" ZMZ^ 

=M''M-{M-N+ WMZ^ f (M-N+ WMZ^ ) 
={N-WMZ^f {M-N+WMZ^)+M'' (N-WMZ^). 

Theorem 1. If the rank of A^ = M-ZjMZf is pi and 
if the unitary matrix Z is equal to Z^ + ^ where ^has 
rank P2, then 

M^'M-Z^M'^MZ^^ 

-N"" N+N"" (M+WMZ^)+(M'' +(WMZ^f)N 
-{WMZ^ Y {WMZ^ ) -M"" (WMZ^ ) -^ WMZ^ f M 

has rank at most 2(pi + p^. 

Proof: The equation in the statement of the theorem 
is a regrouping of the terms in the previous equation. 
The rank of A'^- WMZ2 is at most the rank of A'^ plus 
the rank of W, so the rank of the sum in that equation is 
at most 2(pi + p^, [] 

An alternate proof of this theorem can be derived by 
noting that displacement rank is preserved by taking a 
Schur complement [3, Lemma 1.5.2], and M^M is the 
Schur complement of 



-/ M 

M^ 



which has displacement rank at most 2{p^ + p^) relative 
to the matrix 



Z, 

z 



The bound 2(pi + p^) can be an overestimate, since 
the N and W terms may interact. For example, for a 
Toeplitz matrix with rank computed relative to the 
shift-down matrices, the bound yields 6 while the 
actual rank is 4. If we use circular shifts, though, both 
the bound and the actual rank are 4. 

Using the identity 

ab"" +ba'' ^-{a + b){a+bf --{a-b){a-bf , 
we can easily symmetrize the generators. 

3. Determining a Factorization of M^M 
From tlie Generators 

Theorem 1 tells us how to determine p =2 (pi + P2) 
vectors h; so that 



M"" M - Z^M"" MZ^ =Y,^.,h.hf 



(3) 



114 



Volume 1 1 1, Number 2, March-April 2006 

Journal of Research of the National Institute of Standards and Technology 



where s^ equals plus or minus 1 . When Zy and Z2 are 
shift-down matrices, it has been shown [6, 1,2] that 
this implies that 



M"" M = Y,^.L{h.)L{h.f 



= [L(h,),.,L(h^)]S 



L(h,r 



L{h^)\ 



where 5 = diag(5/), 5/ = diag(5'/) {n x n), and L(h^ is the 
lower triangular Toeplitz matrix with first column equal 
to h^. We now generalize this result somewhat. 

3.1 Triangular Factors of Structured Matrices 

Theorem 2. 

IfZy is nilpotent, then 



and only if 



where 



A = L,(g)mh) 



L.(x)=[x Zx ... Z^-^x\, 
Proof: Suppose ^ = Z^(^)Zf(/i). Observe that 



Llg)Ll{}i)^Vg Z,g ... Zr'g} 






h''{zlT\ 



^%zigh^zi 



7=0 



and 



7=0 

SO, since Z\ = 0, we conclude that 

L,{g)L",{h)-Z,L,{g)L",{h)Z^ =gh". 

To prove the converse, suppose A-Z^AZ^=gh^. 
Then, since 

gh^ =L,(g)LUh)-Z,L,(g)mh)Z,\ 

we conclude that ifE = A- L^(g) Zf (h), then 
E = Z,EZ^ . 

Now since Z^ isnilpotent, Zf =Ofor some p<n. There- 
fore, Z^-'E = Z^EZ^ =0, and working backward in 



powers of Z^, we see that Z^E = Z^EZ^ = 0, so 
A = L,(g)LUh). u 

The following corollary can be proved by finite induction. 
Corollary 1: If Z^ is nilpotent, then 



if and only if 



A-Z,AZ^ =J^g,hf 



A = ^L,{g,)LUM 



3.2 Triangular Factors of MM 

In order to solve our least squares problem, we wish 
to determine a Cholesky factorization 

so we need to reduce the matrix 

to upper triangular form, where the vectors h^ are 
defined in (3). 

If Zi and Z2 are shift-down matrices, then [6] shows 
how to do this reduction fast. Using Corollary 1 , we will 
see that this can be done fast whenever Z2 is triangular 
and nilpotent. We present the algorithm for the case Z2 
lower triangular; the upper triangular case is analogous 
but works with the columns in reverse order. 

The algorithm proceeds by columns, putting zeros 
below the main diagonal. Note that 



h,"Z", 

h,"{z^r 



L = 









Kiz",)' 



Suppose we determine a rotation between the first 
row ^f and row n + 1, which contains ^f, to zero the 
first element of ^f . The same rotation between 
h ^{Z^y andh^{Z^y (y = 1, . . . , /2 - 1) also zeroes the 
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first element of ^f (Zf )^ since Zf is upper triangular. 
Therefore, by introducing one zero into our matrix, we 
have implicitly introduced n - 1 more, so we can put 
zeroes below the main diagonal in column 1 by using 
only p- 1 rotations, independent of the size of n. 

We then use the resulting second row, equal to the 
first row postmultiplied by Zf , to zero the second 
element of row n+ \. Again this implicitly introduces 
additional zeros, n-2 of them, and we complete the 
operations on column 2 by using p - 1 rotations. 

If we repeat this for each column, we accomplish 
our reduction. Let Hhc the pxn matrix whose rows are 
h^ . We can thus reduce L to upper triangular form just 
by operating on the matrix H. 

We design our algorithm to use Givens rotations as 
often as possible, minimizing the number of hyper- 
bolic rotations in order to preserve stability. A Givens 
rotation can be used between row i and rowy when- 
ever s^ and Sj (see equation (3)) have the same sign; if 
they have different signs, then we must use a hyper- 
bolic rotation. We'll assume that we have ordered 
the generators so that the first p rows of H have s^ = 1 
and the remaining ones have s^ = - I. 

Algorithm Reduce (Hf 

Fory=l,. . .,77, 

For/ = 2,. ..,p, 

If hij is nonzero, then 

zero it by a Givens rotation between 
row 1 and row i; 

end for 

For / = p + 2, . . . , p, 
If hij is nonzero, then 

zero it by a Givens rotation between 
row p + 1 and row /; 

end for 



There is an analogous algorithm, FTriang, in [6], for the special case 
in which M is Toeplitz, but it has some typographical errors. In the 
statement following "if i<m", g3 on the left-hand side should be g4. In 
12 places on p. 552, "m+n" should be "mnl". Also, the numbering of 
the phases of the computation is off by one compared with the descrip- 
tion in the paper ("Initialization" should be "Phase V\ etc.). 



If h^^^j is nonzero, then 

zero it by a stabilized hyperbolic rotation 
between row 1 and row p + 1 ; 

Then the y th row of L^ is h ^, the first row 
of the current H matrix. 

Replace the first row of /T by h ^ Z^ to 
form the pivot row for the next value of y. 

end for 

The cost of this reduction is at most O(pn^), ignoring 
sparsity plus the cost of the multiplications by Z2. 
Sometimes sparsity can reduce this cost significantly 
[7]. Without exploiting the structure of L the cost 
would be O(prP). Once the factors LL^ are computed, 
they can then be used to solve (2). 

4. Some Applications 

Our fast algorithm enables us to solve least squares, 
total least squares, and regularized total least squares 
problems involving matrices for which Z^ is close to 
unitary and Z2 is triangular and nilpotent. This includes 
several important classes of structured matrices. 

4.1 Solving Least Squares Problems 

Consider first the least squares problem 



mm 



\Mx-bl 



where Me C'"^" has full column rank. We can use 
our algorithm if 

• M is Toeplitz. Then Z^ and Z2 are the shift-down 
matrices of appropriate size and the displacement 
rank is 2. 



• M is block Toeplitz: 



M = 



M, 



M.-i 



^5^1 ^5 



M, 



^5^y-l ^5^y-2 "' ^S 



where M§ has dimension m / yx n / S. Then 
choosing Zi e C'^'as 
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/ 








/ 



where each block has dimension m I yxm I y, and 
choosing Z2 e C'"^" similarly but with blocks of 
dimension n/ 5xn/ 5 gives a displacement rank 

ofm I Y + n I 5. 



M has Toeplitz blocks 



M = 



^11 ^u 



^M ^ 



r.2 






M 



Y,d 



where each block My is Toeplitz. Then choosing 
Zi e C'"^'" as a block-diagonal matrix consisting 
of shift-down matrices of dimensions matching the 
column dimension of the diagonal blocks of M and 
choosing Z2 e C"^" similarly but with blocks 
matching the row dimension gives a displacement 
rank of 2(7+ 5). 

• M Hankel, block Hankel, or having Hankel 
blocks. These cases are analogous to those con- 
sidered above. 

4.2 Solving Structured Total Least Squares Problems 

In some cases the matrix of the problem can be 
estimated only with error, and we need to determine not 
just the parameters x of the least squares fit but also 
the corrections to the operator. This problem can be 
formulated as 



mm 

E,P,x 



[E P]\ 



(4) 



subject to the constraints 



iA + E)x = y + P 

with A and E having the same structure. 

Suppose that the matrix E can be specified by p 
parameters a^, . . . , a^. For example, if £" is a Toeplitz 
matrix, then 



E = 



«„ 


«„-l 


• 


.. a, ' 


a„+i 


oc„ 


• 


.. a. 


^m+n-l 


«..„. 


-2 


.. a„_ 



and p = m + n-\. We rewrite our problem as 



mm 

a,p,x 



a 



(5) 



where 



p = iA + E)x-y. 



Following [6], we have replaced the term \\E\^ by 
a^a, equivalent except for scaling of the entries af . 
We define the matrix X e C"""^ by the equation 

Xa=Ex. 

For example, ifE is Toeplitz, thenjy = m + « - 1 and 

^n \-l ■•■ ^ 



x = 























Following [11], we form a quadratic approximation 
to (5) by using linear approximations a+ Aa and x + Ax, 
resulting in 

P^{A + {E +AE)){x +Ax)-y 

^ {A + E)x + XAa + {A + E)Ax- y 

so that 



X 


A + E' 


Aa 


+ 


I 


J 


[ax\ 





(A + E)x-y 
a 



If we minimize this with respect to Aa and Ax, then 
we can form a new approximation 

a=a+Aa 
X = x+Ax 

to the solution of (5) and then repeat the procedure until 
convergence. As noted by [11], this is a Gauss-Newton 
algorithm applied to (5). Therefore, the main computa- 
tional task is to solve linear least squares problems of 
the form 







Aa 




mm 


M 




+ 


Aa,Ax 




lAx\ 





{A + E)x-y 
a 



(6) 



where 



M = 



X A + E 
I 



If^ is in one of the classes considered in Sec. 4.1, 
then the matrix M has low displacement rank and we 
can solve the problem fast. 
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4.3 Solving Regularized Total Least 
Squares Problems 

In many deblurring problems and other discretized 
problems involving integral equations of the first kind, 
the matrix A is so ill-conditioned that noise in the 
observations y is magnified in solving the STLS prob- 
lem and a meaningful solution cannot be obtained. 

In this case it is necessary to add a regularization 
constraint to the problem. One common regularization 
constraint is to restrict the size of the solution, or some 
linear transformation of the solution: 

II Cv II < w 

where w is a given scalar and C is commonly chosen to 
be the identity matrix or a difference operator. If C has 
low displacement rank relative to Z2 and an appropriate 
Zi, our algorithm can be easily modified to incorporate 
regularization. In this case, our problem (5) can be 
reformulated as 



mm 



a 

XCx 



(7) 



where P=(A+ E)x -y and X, the regularization para- 
meter, is the Lagrange multiplier for the new constraint. 
Using a derivation similar to that above, the lineariza- 
tion of (7) results in the following problem to be solved 
at each step of the iteration: 



X 


A + E' 


Aa 




' ^ 1 


1 





Ax 


+ 


a 





XC J 




XCx\ 



mm 

Aa,Ax 



Thus, our new M matrix is the matrix M from the 
previous section augmented by the extra rows [0, AC], 
and the only change necessary in the algorithm is to 
find the generators of this matrix rather than the old 
one. 

The displacement structure of this matrix is greatly 
simplified if C is upper triangular and Z^ and Z2 are 
shift-down matrices. As noted before, W is zero except 
for a one in the last position of the first row, and 
thus ^M is zero except for a A in the last position 
of the first row. Therefore, WMZ2 = 0, so, applying 
Theorem 1, we have the following result. 

Theorem 3. If C is upper triangular and Zj and Z2 are 
shift-down matrices, then 

M^'M-Z^M'^MZ^ = (M- Nllf N+N^'iM-N/l) 

(8) 



and has rank 2pi, where Pi is the rank of TV. More gen- 
erally (8) holds whenever WMZ^ = 0. 

5. Conclusions 

We have derived the generators for M^M when Mis 
any matrix of small displacement rank relative to Z^ 
and Z2. We have shown that it is inexpensive to form a 
Cholesky factorization ofM^M whenever Z^ is close to 
unitary and Z2 is triangular and nilpotent, and we have 
generalized this algorithm when a regularization 
constraint is to be applied. 
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